Classifying Chess Positions
Abstract
Chess was one of the first problems studied by the AI community. Current chess-playing programs perform very well using primarily search-based algorithms to decide the best move; in this project I instead apply machine-learning algorithms to the problem. Specifically, rather than choosing the best move, I want to produce a function that estimates the probability that a player will win from a given chess position. Note that while search-based chess engines also produce a score for each position, that score represents the engine's own belief (in a Bayesian sense) that it will win the game from that position, whereas our goal is to estimate the actual probability of a win given human players. The main possible application of such a function would be as a heuristic in an A*-like search-based chess engine. Additionally, the structure of the classifier could shed light on the nature of the game as a whole.
I acquired training examples from actual games played by humans, using the FICS games database, which contains over 100 million games played over the internet over a period of years. This dataset consists of games in PGN (Portable Game Notation) format, which encodes each game as a whole (as a list of moves) rather than as a sequence of positions. Since the goal of this project is to classify positions, I needed to convert these PGN games to position sequences, and used a Python script to do so. This presented a technical challenge because a sequence of positions is several orders of magnitude larger (in terms of memory consumption) than the PGN-encoded games. By using specialized solvers that process examples incrementally, such as the stochastic subgradient method, I was able to avoid storing all the positions in memory at once.
Since these games are played by humans, who have a wide distribution of skill levels and play styles, any results from these data will depend on how the data are filtered.
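The memory argument above can be illustrated with a minimal sketch: a generator yields one training example at a time, and a stochastic subgradient solver updates the weights from each example and then discards it. Everything here is an assumption for illustration. The source does not say which loss was used, so the sketch picks an L2-regularised hinge loss with a Pegasos-style step size, and `stream_examples` is a hypothetical stand-in that produces toy three-dimensional features in place of the real PGN-to-position converter.

```python
import random
from typing import Iterable, Iterator, List, Tuple

Example = Tuple[List[float], int]  # (feature vector, label in {-1, +1})


def stream_examples(n: int, seed: int = 0) -> Iterator[Example]:
    """Hypothetical stand-in for the position converter.

    Yields toy examples one at a time, so the full dataset never
    has to be held in memory.
    """
    rng = random.Random(seed)
    for _ in range(n):
        # Two toy features plus a constant bias term (assumed, not chess features).
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0), 1.0]
        y = 1 if x[0] + 0.5 * x[1] > 0 else -1
        yield x, y


def dot(w: List[float], x: List[float]) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def subgradient_train(examples: Iterable[Example], dim: int,
                      lam: float = 0.01) -> List[float]:
    """Stochastic subgradient descent on lam/2 * ||w||^2 + max(0, 1 - y * w.x).

    Uses the step size 1 / (lam * t) and visits each example exactly once.
    """
    w = [0.0] * dim
    for t, (x, y) in enumerate(examples, start=1):
        eta = 1.0 / (lam * t)
        margin = y * dot(w, x)  # computed before the update
        # Step along the regulariser's gradient (shrinks w).
        w = [(1.0 - eta * lam) * wi for wi in w]
        # The hinge term contributes a subgradient only when the margin is violated.
        if margin < 1.0:
            w = [wi + eta * y * xi for wi, xi in zip(w, x)]
    return w


def predict(w: List[float], x: List[float]) -> int:
    return 1 if dot(w, x) > 0 else -1


if __name__ == "__main__":
    weights = subgradient_train(stream_examples(5000, seed=1), dim=3)
    held_out = list(stream_examples(1000, seed=2))
    accuracy = sum(predict(weights, x) == y for x, y in held_out) / len(held_out)
    print(f"held-out accuracy: {accuracy:.3f}")
```

Because the solver consumes an iterator, only the weight vector and the current example are resident at any time; swapping the toy generator for one that reads games from disk would leave the training loop unchanged.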
For this project, I pre-filter the data only by excluding (1) fast games, in which the amount of time remaining for each player would be a confounding factor for the classifier, (2) games in which either player forfeited on time, (3) games in which either player forfeited due to network disconnection, (4) extremely short games, and (5) games resulting in a draw. This last exclusion allows a binary classifier to be used; however, my approach could be extended to include drawn games.
Formally, we can express this as a machine-learning problem as follows: our input x(i) is a legal chess position from a game played by humans, and our label y(i) is the outcome of that game (a win or loss by the player to move). We are trying to predict the expected value of the outcome of the game given the position. Note that, because humans are playing these games, the result of the game is not a deterministic function of the board state. Furthermore, the nature of chess is such that the vast majority of positions encountered by the algorithm will not clearly favor either color, so they will not be useful as training examples for the classifier and will also raise the error rate when the classifier is tested. Because of these factors, no classifier for this problem can achieve a near-zero error rate over this dataset.
The approach requires representing the board state as a multidimensional vector of features. After investigating several possible feature sets, I settled on representing the board
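The five exclusion rules and the labelling convention described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the author's script: the `GameInfo` record, its field names, the `termination` strings, and both numeric cutoffs are hypothetical, since the source gives no thresholds. With labels in {0, 1}, the expected value of the label given a position equals the probability that the player to move wins.

```python
from dataclasses import dataclass


@dataclass
class GameInfo:
    """Hypothetical per-game summary; field names are assumptions."""
    base_seconds: int  # initial clock time per player
    ply_count: int     # number of half-moves played
    result: str        # PGN Result tag: "1-0", "0-1", or "1/2-1/2"
    termination: str   # assumed values: "normal", "time", "disconnection"


def keep_game(game: GameInfo, min_base_seconds: int = 180,
              min_plies: int = 20) -> bool:
    """Apply the five exclusions; both cutoffs are assumed, not from the source."""
    if game.base_seconds < min_base_seconds:   # (1) fast games
        return False
    if game.termination == "time":             # (2) forfeited on time
        return False
    if game.termination == "disconnection":    # (3) network disconnection
        return False
    if game.ply_count < min_plies:             # (4) extremely short games
        return False
    if game.result not in ("1-0", "0-1"):      # (5) draws and unfinished games
        return False
    return True


def label_for_position(result: str, white_to_move: bool) -> int:
    """Return 1 if the player to move went on to win, else 0.

    Assumes the game passed keep_game, so the result is decisive.
    """
    white_won = result == "1-0"
    return 1 if white_won == white_to_move else 0


if __name__ == "__main__":
    game = GameInfo(base_seconds=900, ply_count=64, result="0-1",
                    termination="normal")
    print(keep_game(game))                                # True
    print(label_for_position(game.result, True))          # 0: White to move, Black won
    print(label_for_position(game.result, False))         # 1: Black to move, Black won
```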
Similar resources
Effects of the Lateral- and Double-Thinking Strategies on the Chess Positions Solving and Performance Time
Background. The game of chess, which is viewed as a symbol of intellectual prowess, is a valuable educational tool that can improve cognitive behavior such as thinking models; but the effects of thinking strategies such as double-thinking strategies (DTS) and lateral-thinking strategies (LTS) on chess performance have not been investigated. Objectives. This study aimed to measure the effects...
Evaluating a Parallel Evolutionary Algorithm on the Chess Endgame Problem
Classifying the endgame positions in Chess can be challenging for humans and is known to be a difficult task in machine learning. An evolutionary algorithm would seem to be the ideal choice. We describe our implementation of a parallel island model and evaluate it in the context of the Chess Endgame data set from the UCI machine learning repository. We are mainly interested in impact of paralle...
Uniqueness in Chess Studies
Van der Heijden’s ENDGAME STUDY DATABASE IV, HhdbIV, is the definitive collection of 76,132 chess studies. In each one, White is to achieve the stipulated goal, win or draw: study solutions should be essentially unique with minor alternatives at most. In this second note on the mining of the database, we use the definitive Nalimov endgame tables to benchmark White’s moves in sub-7-man chess aga...
Recall of Briefly Presented Chess Positions and Its Relation to Chess Skill
Individual differences in memory performance in a domain of expertise have traditionally been accounted for by previously acquired chunks of knowledge and patterns. These accounts have been examined experimentally mainly in chess. The role of chunks (clusters of chess pieces recalled in rapid succession during recall of chess positions) and their relations to chess skill are, however, under deb...
Learning to Evaluate Chess Positions with Deep Neural Networks and Limited Lookahead
In this paper we propose a novel supervised learning approach for training Artificial Neural Networks (ANNs) to evaluate chess positions. The method that we present aims to train different ANN architectures to understand chess positions similarly to how highly rated human players do. We investigate the capabilities that ANNs have when it comes to pattern recognition, an ability that distinguish...